Automatic Discovery Of Term Similarities Using Pattern Mining
نویسندگان
چکیده
Term recognition and clustering are key topics in automatic knowledge acquisition and text mining. In this paper we present a novel approach to the automatic discovery of term similarities, which serves as a basis for both classification and clustering of domain-specific concepts represented by terms. The method is based on automatic extraction of significant patterns in which terms tend to appear. The approach is domain independent: it needs no manual description of domain-specific features and it is based on knowledge-poor processing of specific term features. However, automatically collected patterns are domain specific and identify significant contexts in which terms are used. Beside features that represent contextual patterns, we use lexical and functional similarities between terms to define a combined similarity measure. The approach has been tested and evaluated in the domain of molecular biology, and preliminary results are presented.
منابع مشابه
Automatic Discovery of Technology Networks for Industrial-Scale R&D IT Projects via Data Mining
Industrial-Scale R&D IT Projects depend on many sub-technologies which need to be understood and have their risks analysed before the project can begin for their success. When planning such an industrial-scale project, the list of technologies and the associations of these technologies with each other is often complex and form a network. Discovery of this network of technologies is time consumi...
متن کاملMining Term Similarities from Corpora*
In this article we present an approach to the automatic discovery of term similarities, which may serve as a basis for a number of term-oriented knowledge mining tasks. The method for term comparison combines internal (lexical similarity) and two types of external criteria (syntactic and contextual similarities). Lexical similarity is based on sharing lexical constituents (i.e. term heads and m...
متن کاملTemporal Databases and Frequent Pattern Mining Techniques
Data mining is the process of exploring and analyzing data from different perspective, using automatic or semiautomatic techniques to extract knowledge or useful information and discover correlations or meaningful patterns and rules from large databases. One of the most vital characteristic missed by the traditional data mining systems is their capability to record and process time-varying aspe...
متن کاملDiscovery of Frequent Patterns from Web Log Data by using FP-Growth algorithm for Web Usage Mining
Web usage mining refers to the automatic discovery and analysis of patterns in click stream and associated data collected or generated as a result of user interactions with web resources on one or more web sites. It consists of three phases which are data Preprocessing, pattern discovery and pattern analysis. In the pattern discovery phase, frequent pattern discovery algorithms applied on raw d...
متن کاملTactical Analysis Modeling through Data Mining - Pattern Discovery in Racket Sports
We explore pattern discovery within the game of tennis. To this end, we formalize events in a match, and define similarities for events and event sequences. We then proceed by looking at unbalancing events and their immediate prequel (using pattern masks) and sequel (using nondeterministic finite automata). Structured in this way, the data can be effectively mined, and a similar approach might ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2002